Support RHAIIS images for the e2e tests #2032
base: main
Conversation
Signed-off-by: Dan Huang <[email protected]>
👋 Hi! Thank you for contributing to llm-compressor. Please add the ready label when the PR is ready for review. Note: this is required to complete the testing suite; please only add the label once the PR is code complete and local testing has been performed.
Summary of Changes
Hello @dhuangnm, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly expands the capabilities of the e2e testing framework by integrating support for running tests against vLLM instances deployed as RHAIIS images in a CI/CD context. It introduces a more flexible way to specify test configurations, including a new 'smoke test' option for quicker runs. The changes ensure that the testing infrastructure can adapt to different vLLM deployment scenarios, from local development to containerized environments, by managing environment variables, model saving paths, and the execution of vLLM commands, ultimately enhancing the robustness and versatility of the testing pipeline.
Code Review
This pull request introduces support for running e2e tests using RHAIIS images, which is a significant enhancement for the CI/CD pipeline. It also adds flexibility by allowing tests to be run from a configuration file list. The changes are well-structured. My review focuses on improving the robustness of the shell script handling file lists, adhering to Pythonic conventions, and refining the new logic for executing tests within a Kubernetes environment. I've suggested changes to prevent potential issues with filenames containing special characters, to improve code readability and maintainability by removing sentinel string values and duplicated code, and to ensure the kubectl command is used appropriately for non-interactive scripting.
tests/e2e/vLLM/test_vllm.py
Outdated
```python
with open(self.vllm_bash, "w") as cf:
    cf.write(f"""#!/bin/bash
    export HF_HUB_OFFLINE=0
    export VLLM_NO_USAGE_STATS=1
    {vllm_cmd}
    """)
```
The multiline f-string will preserve the leading whitespace from the source code, resulting in a shell script with incorrect indentation. This can lead to issues when the script is executed. It's better to construct the string without the extra indentation, for example by using string literal concatenation.
Suggested change:
```python
with open(self.vllm_bash, "w") as cf:
    cf.write(
        "#!/bin/bash\n"
        "export HF_HUB_OFFLINE=0\n"
        "export VLLM_NO_USAGE_STATS=1\n"
        f"{vllm_cmd}\n"
    )
```
tests/e2e/vLLM/run_tests.sh
Outdated
```diff
     CONFIGS=`ls "$CONFIG"`
 elif [ -f "$CONFIG" ]; then
     echo "Config is provided as a file: $CONFIG"
     CONFIGS=`cat "$CONFIG"`
 fi
 echo "$CONFIGS"

 # Parse list of configs.
-for MODEL_CONFIG in "$CONFIG"/*
+for MODEL_CONFIG in $(echo -e "$CONFIGS" | sed "s|^|${script_path}/configs/|")
```
Using ls and cat to build a list of files, and then iterating over it with an unquoted for loop is not robust. This will fail if any filenames contain spaces or other special characters due to word splitting. A safer approach would be to use a while read loop to process the configs line by line, which correctly handles such filenames.
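For reference, a minimal sketch of the while read pattern suggested here, assuming $CONFIG is a file listing one config name per line and script_path is set as in the PR's script; the echo is a placeholder for the actual test invocation:
```bash
# Process each config name safely; IFS= and -r keep leading/trailing
# whitespace and backslashes in the name intact, and no word splitting
# occurs, so names containing spaces are handled correctly.
while IFS= read -r CONFIG_NAME; do
    [ -z "$CONFIG_NAME" ] && continue            # skip blank lines
    MODEL_CONFIG="${script_path}/configs/${CONFIG_NAME}"
    echo "Would run tests for: $MODEL_CONFIG"    # placeholder for the test command
done < "$CONFIG"
```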
tests/e2e/vLLM/test_vllm.py
Outdated
```python
RUN_SAVE_DIR=os.environ.get("RUN_SAVE_DIR", "none")
# when using vllm image, needs to save the generated model
if VLLM_PYTHON_ENV.lower() != "same" and (not Path(VLLM_PYTHON_ENV).exists()):
    IS_VLLM_IMAGE = True
    assert RUN_SAVE_DIR != "none", "To use vllm image, RUN_SAVE_DIR must be set!"
```
Using the string "none" as a sentinel value for the RUN_SAVE_DIR environment variable is not idiomatic. It's more Pythonic to use None. You can get this behavior by calling os.environ.get("RUN_SAVE_DIR") without a default value, which returns None if the variable is not set. Then, checks can be performed with is not None. This should be applied everywhere RUN_SAVE_DIR is used (e.g., line 92).
Suggested change:
```python
RUN_SAVE_DIR=os.environ.get("RUN_SAVE_DIR")
# when using vllm image, needs to save the generated model
if VLLM_PYTHON_ENV.lower() != "same" and (not Path(VLLM_PYTHON_ENV).exists()):
    IS_VLLM_IMAGE = True
    assert RUN_SAVE_DIR is not None, "To use vllm image, RUN_SAVE_DIR must be set!"
```
tests/e2e/vLLM/test_vllm.py
Outdated
```python
if RUN_SAVE_DIR != "none":
    assert Path(RUN_SAVE_DIR).exists(), f"RUN_SAVE_DIR path doesn't exist: {RUN_SAVE_DIR}"
    self.run_save_dir = RUN_SAVE_DIR
    # RUN_SAVE_DIR overwrites config save_dir if specified
    self.save_dir = os.path.join(RUN_SAVE_DIR, self.model.split("/")[1] + f"-{self.scheme}")

if not self.save_dir:
    self.save_dir = self.model.split("/")[1] + f"-{self.scheme}"
```
The expression to generate the save directory name is duplicated. To follow the DRY (Don't Repeat Yourself) principle, you can calculate it once and store it in a variable for reuse. This improves maintainability.
Suggested change:
```python
save_dir_basename = self.model.split("/")[1] + f"-{self.scheme}"
if RUN_SAVE_DIR != "none":
    assert Path(RUN_SAVE_DIR).exists(), f"RUN_SAVE_DIR path doesn't exist: {RUN_SAVE_DIR}"
    self.run_save_dir = RUN_SAVE_DIR
    # RUN_SAVE_DIR overwrites config save_dir if specified
    self.save_dir = os.path.join(RUN_SAVE_DIR, save_dir_basename)

if not self.save_dir:
    self.save_dir = save_dir_basename
```
tests/e2e/vLLM/test_vllm.py
Outdated
```python
logger.info("vllm image. Run vllm cmd with kubectl.")
result = subprocess.Popen(
    [
        "kubectl", "exec", "-it",
```
The -it flags in kubectl exec are for allocating an interactive terminal session (pseudo-TTY). This is unnecessary for a non-interactive script and can cause issues in some CI environments where a TTY is not available. It's safer to remove them as they are not needed for this use case.
Suggested change:
```python
        "kubectl", "exec",
```
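For illustration, this is what the non-interactive form looks like at the shell level; the pod name and script path are hypothetical stand-ins, not names from the PR:
```bash
# No -it: kubectl does not request a TTY, so this also works in CI
# runners that have no terminal attached.
kubectl exec vllm-test-pod -- bash /tmp/vllm_cmd.sh
```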
Signed-off-by: Dan Huang <[email protected]>
dsikka left a comment
I think we want this separate from our normal e2e testing.
Key things to change:
- We should keep run_tests.sh as is and create a separate rhaiis image testing bash script
- The vLLM test is already written as functions, which we should pull into a separate rhaiis-e2e test
This doesn't affect our normal e2e testing. We don't change any existing ways to run the e2e tests; we're just adding a new way to allow the tests to run with RHAIIS images. The changes to run_tests.sh allow a smoke list file to be used; it still supports the current way of using the configs folder, and nothing is changed there either.
I think we should do the following:
- Separate the test into two steps: Model Generation (handled by set-up, run_oneshot_for_e2e_testing, and uploading to the hub), followed by vLLM Inference.
- Run the second step in either a vLLM image or using vLLM nightly. This will use the models generated from step (1) and will allow separation between upstream and midstream flows.
I think most of the changes are on the vLLM side, and this is also the most reflective of the actual flow we'd expect when using upstream and RHAIIS.
Signed-off-by: Dan Huang <[email protected]>
Thanks Dipika for the suggestions. I updated the code accordingly.
Please review again.
SUMMARY:
Add support to allow the e2e tests to run with RHAIIS images in our CI/CD infrastructure:
- Added a rhaiis-e2e-smoke.list file that lists the models as a smoke test for the e2e tests run with the RHAIIS images.
- Added run_tests_in_rhaiis.sh to run the e2e tests with the RHAIIS images, e.g.:
bash tests/e2e/vLLM/run_tests_in_rhaiis.sh -c tests/e2e/vLLM/e2e-smoke.list -t tests/e2e/vLLM/test_vllm.py -s <path to save results etc>
A successful run for the e2e tests using the RHAIIS images is here: https://github.com/neuralmagic/llm-compressor-testing/actions/runs/19676158533
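For context, a hypothetical sketch of what such a smoke list might contain, assuming one config name per line (the format the cat-based branch of run_tests.sh consumes); the file names below are illustrative, not taken from the PR:
```
fp8_dynamic_per_token.yaml
int8_dynamic_per_token.yaml
w4a16_grouped_quant.yaml
```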
TEST PLAN:
All tests